
Low-Resource NMT: A Case Study on the Written and Spoken Languages in Hong Kong

Mak, Hei Yi, Lee, Tan

arXiv.org Artificial Intelligence

The majority of inhabitants in Hong Kong are able to read and write in standard Chinese but use Cantonese as the primary spoken language in daily life. Spoken Cantonese can be transcribed into Chinese characters, which constitute the so-called written Cantonese. Written Cantonese exhibits significant lexical and grammatical differences from standard written Chinese. The rise of written Cantonese is increasingly evident in the cyber world. The growing interaction between Mandarin speakers and Cantonese speakers is leading to a clear demand for automatic translation between Chinese and Cantonese. This paper describes a transformer-based neural machine translation (NMT) system for written-Chinese-to-written-Cantonese translation. Given that parallel text data of Chinese and Cantonese are extremely scarce, a major focus of this study is the effort of preparing a good amount of training data for NMT. In addition to collecting 28K parallel sentences from previous linguistic studies and scattered internet resources, we devise an effective approach to obtaining 72K parallel sentences by automatically extracting pairs of semantically similar sentences from parallel articles on Chinese Wikipedia and Cantonese Wikipedia. We show that leveraging highly similar sentence pairs mined from Wikipedia improves translation performance on all test sets. Our system outperforms Baidu Fanyi's Chinese-to-Cantonese translation on 6 out of 8 test sets in BLEU scores. Translation examples reveal that our system is able to capture important linguistic transformations between standard Chinese and spoken Cantonese.
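The mining step described in the abstract, extracting semantically similar sentence pairs from parallel Chinese and Cantonese Wikipedia articles, can be sketched with sentence embeddings and cosine similarity. The greedy one-best matching and the similarity threshold below are illustrative assumptions, not the paper's exact procedure; in practice the embeddings would come from a multilingual sentence encoder.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors; 0.0 if either is zero.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mine_pairs(src_vecs, tgt_vecs, threshold=0.8):
    # Greedy mining: for each source-sentence embedding, keep the single
    # best-matching target sentence if its similarity exceeds the threshold.
    pairs = []
    for i, sv in enumerate(src_vecs):
        best_j, best_sim = None, threshold
        for j, tv in enumerate(tgt_vecs):
            sim = cosine(sv, tv)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None:
            pairs.append((i, best_j, best_sim))
    return pairs
```

Filtering on a high threshold trades recall for precision, which matches the abstract's emphasis on "highly similar" pairs improving translation quality.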


Reviews: Unsupervised Learning of Spoken Language with Visual Context

Neural Information Processing Systems

This is interesting work that points in the right direction, but a few aspects of this paper are a bit problematic: 1) It would have been useful (or interesting) to use a corpus that has existing text captions, and either have users re-speak the text captions or collect additional captions. The data collection seems generally well thought-out, but why was the Places205 dataset used? Prompted speech (such as that collected here) is not "spontaneous"; otherwise the WSJ recognizer would not have given 20% WER (this aspect is irrelevant for the purpose of this paper, though, I think). Typically, multiple captions are generated for a single image. Has this been done here as well? Or is there only a single caption for each image?


The World's Top 10 Most Spoken Languages

#artificialintelligence

The Amazon growth story has been a remarkable one so far. On the top line, the company has grown every single year since its inception. Even going back to 2004, Amazon generated a much more modest $6.9 billion in revenue, compared with the massive $469 billion for 2021. Most of these sales come from the retail and e-commerce operations the company has come to be known for. Yet 74% of Amazon's operating profit comes from Amazon Web Services (AWS).


Ranked: The 100 Most Spoken Languages Around the World

#artificialintelligence

Even though you're reading this article in English, there's a good chance it might not be your mother tongue. Of the billion-strong English speakers in the world, only 33% consider it their native language. The popularity of a language depends greatly on utility and geographic location. Additionally, how we measure the spread of world languages can vary greatly depending on whether you look at total speakers or native speakers. Today's detailed visualization from WordTips illustrates the 100 most spoken languages in the world, the number of native speakers for each language, and the origin tree that each language has branched out from.


Unsupervised Learning of Spoken Language with Visual Context

Harwath, David, Torralba, Antonio, Glass, James

Neural Information Processing Systems

Humans learn to speak before they can read or write, so why can't computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data comprised of over 120,000 spoken audio captions for the Places image dataset and evaluate our model on an image search and annotation task. We also provide some visualizations which suggest that our model is learning to recognize meaningful words within the caption spectrograms.


Which Is The World's Most Spoken Language? Terpene. What's That, You Ask?

International Business Times

China, the most populous country in the world, has close to one billion people who speak Mandarin. Spanish is spoken by less than half that number, primarily in Mexico, Spain and countries in South America. English follows close behind, with Hindi in India and Arabic in the Middle East making up the top five. Or so you would think. The most common language in the world is actually not human at all.